31 research outputs found

News Comments: Exploring, Modeling, and Online Prediction

Author: Maarten De Rijke
Manos Tsagkias
Wouter Weerkamp
Publication date: 01/01/2010

Abstract. Online news agents provide commenting facilities for their readers to express their opinions or sentiments with regard to news stories. The number of user-supplied comments on a news article may be indicative of its importance, interestingness, or impact. We explore the news comments space, and compare the log-normal and the negative binomial distributions for modeling comments from various news agents. These estimated models can be used to normalize raw comment counts and enable comparison across different news sites. We also examine the feasibility of online prediction of the number of comments, based on the volume observed shortly after publication. We report on solid performance for predicting news comment volume in the long run, after short observation. This prediction can be useful for identifying news stories with the potential to “take off,” and can be used to support front page optimization for news sites.

CiteSeerX

International Migration, Integration and Social Cohesion online publications

Using term clouds to represent segment-level semantic content of podcasts

Author: Besser Jana
de Rijke Maarten
Fuller Marguerite
Jones Gareth J.F.
Larson Martha
Newman Eamonn
Tsagkias Manos
Publication date: 01/01/2008

Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without the support of an interface providing semantically annotated jump points to signal the user where to listen in. Creation of time-aligned metadata by human annotators is prohibitively expensive, motivating the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from an ASR transcript. Quality of segment-level term clouds is measured quantitatively and their utility is investigated using a small-scale user study based on human-labeled segment boundaries. Since the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech in order to be able to generate segments as part of a completely automated indexing and structuring system for browsing of spoken audio. Results demonstrate that the segments generated are comparable with human-selected segment boundaries.

Irish Universities

DCU Online Research Access Service

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Recipient Recommendation in Enterprises using Communication Graphs and Email Content

Author: David Graus
David Van Dijk
Maarten De Rijke
Manos Tsagkias
Wouter Weerkamp
Publication date: 03/04/2020

Abstract. We address the task of recipient recommendation for emailing in enterprises. We propose an intuitive and elegant way of modeling the task of recipient recommendation, which uses both the communication graph (i.e., who is most closely connected to the sender) and the content of the email. Additionally, the model can incorporate evidence as prior probabilities. Experiments on two enterprise email collections show that our model achieves very high scores, and that it outperforms two variants that use either the communication graph or the content in isolation.

CiteSeerX

LiMoSiNe pipeline: Multilingual UIMA-based NLP platform

Author: Barlacchi Gianni
Moschitti Alessandro
Plank Barbara
Tsagkias Manos
Uryupina Olga
Uva Antonio
Valverde-Albacete Francisco J
Publication date: 01/01/2016

We present a robust and efficient parallelizable multilingual UIMA-based platform for automatically annotating textual inputs with different layers of linguistic description, ranging from surface-level phenomena all the way down to deep discourse-level information. In particular, given an input text, the pipeline extracts: sentences and tokens; entity mentions; syntactic information; opinionated expressions; relations between entity mentions; co-reference chains and wikified entities. The system is available in two versions: a standalone distribution enables design and optimization of user-specific sub-modules, whereas a server-client distribution allows for straightforward high-performance NLP processing, reducing the engineering cost for higher-level tasks.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Open Access Repository

Dissertations of the University of Groningen

User engagement in online News: Under the scope of sentiment, interest, affect, and gaze

Author: Abbasi
Attfield
Bai
Bandari
Bautin
Bautin
Beineke
Buscher
Cohen
Cohen
Counts
Dave
Devitt
Fischer
Galtung
Gerani
Godbole
Gregory
Gwizdka
Henderson
Kahneman
Kucuktunc
Lerman
Lopatovska
Manduchi
Manos
McCay-Peet
McCombs
O'Brien
O'Brien
O'Brien
O'Connor
Pang
Pannasch
Rayner
Remington
Shepherd
Szabo
Tatar
Thelwall
Thelwall
Thelwall
Thelwall
Tsagkias
Turney
Vural
Wang
Ward
Watson
Weber
Wu
Yi
Zhang
Zhang
Publication venue: Wiley

Crossref

Early Detection of Topical Expertise in Community Question Answering

Author: David Van Dijk
Manos Tsagkias
Publication date: 06/03/2020

Abstract. We focus on detecting potential topical experts in community question answering platforms early on in their lifecycle. We use a semi-supervised machine learning approach. We extract three types of feature: (i) textual, (ii) behavioral, and (iii) time-aware, which we use to predict whether a user will become an expert in the long term. We compare our method to a machine learning method based on a state-of-the-art method in expertise retrieval. Results on data from Stack Overflow demonstrate the utility of adding behavioral and time-aware features to the baseline method, with a net improvement in accuracy of 26% for very early detection of expertise.

CiteSeerX

Early Detection of Topical Expertise in Community Question Answering

Author: David Van Dijk
Manos Tsagkias
Publication date: 06/03/2020

Abstract. We focus on detecting potential topical experts in community question answering platforms early on in their lifecycle. We use a semi-supervised machine learning approach. We extract three types of feature: (i) textual, (ii) behavioral, and (iii) time-aware, which we use to predict whether a user will become an expert in the long term. We compare our method to a state-of-the-art machine learning method in expertise retrieval. Results on data from Stack Overflow demonstrate the utility of adding behavioral and time-aware features to the state-of-the-art method, with a net improvement in accuracy of 26% for very early detection of expertise.

CiteSeerX